risk prediction
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > Montserrat (0.04)
- (3 more...)
LungEvaty: A Scalable, Open-Source Transformer-based Deep Learning Model for Lung Cancer Risk Prediction in LDCT Screening
Brandt, Johannes, Chevli, Maulik, Braren, Rickmer, Kaissis, Georgios, Müller, Philip, Rueckert, Daniel
Lung cancer risk estimation is gaining increasing importance as more countries introduce population-wide screening programs using low-dose CT (LDCT). As imaging volumes grow, scalable methods that can process entire lung volumes efficiently are essential to tap into the full potential of these large screening datasets. Existing approaches either over-rely on pixel-level annotations, limiting scalability, or analyze the lung in fragments, weakening performance. We present LungEvaty, a fully transformer-based framework for predicting 1-6 year lung cancer risk from a single LDCT scan. The model operates on whole-lung inputs, learning directly from large-scale screening data to capture comprehensive anatomical and pathological cues relevant for malignancy risk. Using only imaging data and no region supervision, LungEvaty matches state-of-the-art performance, refinable by an optional Anatomically Informed Attention Guidance (AIAG) loss that encourages anatomically focused attention. In total, LungEvaty was trained on more than 90,000 CT scans, including over 28,000 for fine-tuning and 6,000 for evaluation. The framework offers a simple, data-efficient, and fully open-source solution that provides an extensible foundation for future research in longitudinal and multimodal lung cancer risk prediction.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Germany > Hamburg (0.04)
Cost-Aware Prediction (CAP): An LLM-Enhanced Machine Learning Pipeline and Decision Support System for Heart Failure Mortality Prediction
Yu, Yinan, Dippel, Falk, Lundberg, Christina E., Lindgren, Martin, Rosengren, Annika, Adiels, Martin, Sjöland, Helen
Objective: Machine learning (ML) predictive models are often developed without considering downstream value trade-offs and clinical interpretability. This paper introduces a cost-aware prediction (CAP) framework that combines cost-benefit analysis assisted by large language model (LLM) agents to communicate the trade-offs involved in applying ML predictions. Materials and Methods: We developed an ML model predicting 1-year mortality in patients with heart failure (N = 30,021, 22% mortality) to identify those eligible for home care. We then introduced clinical impact projection (CIP) curves to visualize important cost dimensions - quality of life and healthcare provider expenses, further divided into treatment and error costs, to assess the clinical consequences of predictions. Finally, we used four LLM agents to generate patient-specific descriptions. The system was evaluated by clinicians for its decision support value. Results: The eXtreme gradient boosting (XGB) model achieved the best performance, with an area under the receiver operating characteristic curve (AUROC) of 0.804 (95% confidence interval (CI) 0.792-0.816), area under the precision-recall curve (AUPRC) of 0.529 (95% CI 0.502-0.558) and a Brier score of 0.135 (95% CI 0.130-0.140). Discussion: The CIP cost curves provided a population-level overview of cost composition across decision thresholds, whereas LLM-generated cost-benefit analysis at individual patient-levels. The system was well received according to the evaluation by clinicians. However, feedback emphasizes the need to strengthen the technical accuracy for speculative tasks. Conclusion: CAP utilizes LLM agents to integrate ML classifier outcomes and cost-benefit analysis for more transparent and interpretable decision support.
- Europe > Sweden > Vaestra Goetaland > Gothenburg (0.05)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Europe > Sweden > Västerbotten County > Umeå (0.04)
Risk Prediction of Cardiovascular Disease for Diabetic Patients with Machine Learning and Deep Learning Techniques
Accurate prediction of cardiovascular disease (CVD) risk is crucial for healthcare institutions. This study addresses the growing prevalence of diabetes and its strong link to heart disease by proposing an efficient CVD risk prediction model for diabetic patients using machine learning (ML) and hybrid deep learning (DL) approaches. The BRFSS dataset was preprocessed by removing duplicates, handling missing values, identifying categorical and numerical features, and applying Principal Component Analysis (PCA) for feature extraction. Several ML models, including Decision Trees (DT), Random Forest (RF), k-Nearest Neighbors (KNN), Support Vector Machine (SVM), AdaBoost, and XGBoost, were implemented, with XGBoost achieving the highest accuracy of 0.9050. Various DL models, such as Artificial Neural Networks (ANN), Deep Neural Networks (DNN), Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), and Gated Recurrent Unit (GRU), as well as hybrid models combining CNN with LSTM, BiLSTM, and GRU, were also explored. Some of these models achieved perfect recall (1.00), with the LSTM model achieving the highest accuracy of 0.9050. Our research highlights the effectiveness of ML and DL models in predicting CVD risk among diabetic patients, automating and enhancing clinical decision-making. High accuracy and F1 scores demonstrate these models' potential to improve personalized risk management and preventive strategies.
- Oceania > Guam (0.04)
- North America > United States > District of Columbia (0.04)
- North America > US Virgin Islands (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
KAT-GNN: A Knowledge-Augmented Temporal Graph Neural Network for Risk Prediction in Electronic Health Records
Lin, Kun-Wei, Kuo, Yu-Chen, Wang, Hsin-Yao, Tseng, Yi-Ju
Abstract-- Clinical risk prediction using electronic health records (EHRs) is vital to facilitate timely interventions and clinical decision support. However, modeling heterogeneous and irregular temporal EHR data presents significant challenges. We propose KAT -GNN (Knowledge-Augmented Temporal Graph Neural Network), a graph-based framework that integrates clinical knowledge and temporal dynamics for risk prediction. These graphs are then augmented using two knowledge sources: (1) ontology-driven edges derived from SNOMED CT and (2) co-occurrence priors extracted from EHRs. Subsequently, a time-aware transformer is employed to capture longitudinal dynamics from the graph-encoded patient representations. KAT -GNN is evaluated on three distinct datasets and tasks: coronary artery disease (CAD) prediction using the Chang Gung Research Database (CGRD) and in-hospital mortality prediction using the MIMIC-III and MIMIC-IV datasets. KAT - GNN achieves state-of-the-art performance in CAD prediction (AUROC: 0.9269 0.0029) and demonstrated strong results in mortality prediction in MIMIC-III (AUROC: 0.9230 0.0070) and MIMIC-IV (AUROC: 0.8849 0.0089), consistently outperforming established baselines such as GRASP and RETAIN. Ablation studies confirm that both knowledge-based augmentation and the temporal modeling component are significant contributors to performance gains. These findings demonstrate that the integration of clinical knowledge into graph representations, coupled with a time-aware attention mechanism, provides an effective and generalizable approach for risk prediction across diverse clinical tasks and datasets. NTRODUCTION This study was supported by grants from the National Science and T echnology Council, T aiwan (NSTC 114-2221-E-A49-061), the Higher Education Sprout Project of the National Y ang Ming Chiao T ung University and MOE, T aiwan (CGMH-NYCU-114-CORPG2P0072), and Chang Gung Memorial Hospital (CMRPG2P0342).
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Asia > Middle East > Israel (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
MV-MLM: Bridging Multi-View Mammography and Language for Breast Cancer Diagnosis and Risk Prediction
Zheng, Shunjie-Fabian, Lee, Hyeonjun, Kooi, Thijs, Diba, Ali
Large annotated datasets are essential for training robust Computer-Aided Diagnosis (CAD) models for breast cancer detection or risk prediction. However, acquiring such datasets with fine-detailed annotation is both costly and time-consuming. Vision-Language Models (VLMs), such as CLIP, which are pre-trained on large image-text pairs, offer a promising solution by enhancing robustness and data efficiency in medical imaging tasks. This paper introduces a novel Multi-View Mammography and Language Model for breast cancer classification and risk prediction, trained on a dataset of paired mammogram images and synthetic radiology reports. Our MV-MLM leverages multi-view supervision to learn rich representations from extensive radiology data by employing cross-modal self-supervision across image-text pairs. This includes multiple views and the corresponding pseudo-radiology reports. W e propose a novel joint visual-textual learning strategy to enhance generalization and accuracy performance over different data types and tasks to distinguish breast tissues or cancer characteristics(calcification, mass) and utilize these patterns to understand mammography images and predict cancer risk. W e evaluated our method on both private and publicly available datasets, demonstrating that the proposed model achieves state-of-the-art performance in three classification tasks: (1) malignancy classification, (2) subtype classification, and (3) image-based cancer risk prediction. Furthermore, the model exhibits strong data efficiency, outperforming existing fully supervised or VLM baselines while trained on synthetic text reports and without the need for actual radiology reports.
- Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Support Vector Machine-Based Burnout Risk Prediction with an Interactive Interface for Organizational Use
Teodosio, Bruno W. G., Lira, Mário J. O. T., Araújo, Pedro H. M., Farias, Lucas R. C.
Burnout is a psychological syndrome marked by emotional exhaustion, depersonalization, and reduced personal accomplishment, with a significant impact on individual well-being and organizational performance. This study proposes a machine learning approach to predict burnout risk using the HackerEarth Employee Burnout Challenge dataset. Three supervised algorithms were evaluated: nearest neighbors (KNN), random forest, and support vector machine (SVM), with model performance evaluated through 30-fold cross-validation using the determination coefficient (R2). Among the models tested, SVM achieved the highest predictive performance (R2 = 0.84) and was statistically superior to KNN and Random Forest based on paired $t$-tests. To ensure practical applicability, an interactive interface was developed using Streamlit, allowing non-technical users to input data and receive burnout risk predictions. The results highlight the potential of machine learning to support early detection of burnout and promote data-driven mental health strategies in organizational settings.
- South America > Brazil > Pernambuco > Recife (0.06)
- Africa > Democratic Republic of the Congo (0.04)
LIFT: Interpretable truck driving risk prediction with literature-informed fine-tuned LLMs
Hu, Xiao, Lian, Yuansheng, Zhang, Ke, Li, Yunxuan, Su, Yuelong, Li, Meng
This study proposes an interpretable prediction framework with literature-informed fine-tuned (LIFT) LLMs for truck driving risk prediction. The framework integrates an LLM-driven Inference Core that predicts and explains truck driving risk, a Literature Processing Pipeline that filters and summarizes domain-specific literature into a literature knowledge base, and a Result Evaluator that evaluates the prediction performance as well as the interpretability of the LIFT LLM. After fine-tuning on a real-world truck driving risk dataset, the LIFT LLM achieved accurate risk prediction, outperforming benchmark models by 26.7% in recall and 10.1% in F1-score. Furthermore, guided by the literature knowledge base automatically constructed from 299 domain papers, the LIFT LLM produced variable importance ranking consistent with that derived from the benchmark model, while demonstrating robustness in interpretation results to various data sampling conditions. The LIFT LLM also identified potential risky scenarios by detecting key combination of variables in truck driving risk, which were verified by PERMANOVA tests. Finally, we demonstrated the contribution of the literature knowledge base and the fine-tuning process in the interpretability of the LIFT LLM, and discussed the potential of the LIFT LLM in data-driven knowledge discovery.
- Asia > China > Beijing > Beijing (0.04)
- North America > United States (0.04)
- Europe (0.04)
- Transportation > Ground > Road (1.00)
- Transportation > Freight & Logistics Services (1.00)
- Automobiles & Trucks (0.93)
Explainable artificial intelligence model predicting the risk of all-cause mortality in patients with type 2 diabetes mellitus
Vershinina, Olga, Sabbatinelli, Jacopo, Bonfigli, Anna Rita, Colombaretti, Dalila, Giuliani, Angelica, Krivonosov, Mikhail, Trukhanov, Arseniy, Franceschi, Claudio, Ivanchenko, Mikhail, Olivieri, Fabiola
Objective. Type 2 diabetes mellitus (T2DM) is a highly prevalent non-communicable chronic disease that substantially reduces life expectancy. Accurate estimation of all-cause mortality risk in T2DM patients is crucial for personalizing and optimizing treatment strategies. Research Design and Methods. This study analyzed a cohort of 554 patients (aged 40-87 years) with diagnosed T2DM over a maximum follow-up period of 16.8 years, during which 202 patients (36%) died. Key survival-associated features were identified, and multiple machine learning (ML) models were trained and validated to predict all-cause mortality risk. To improve model interpretability, Shapley additive explanations (SHAP) was applied to the best-performing model. Results. The extra survival trees (EST) model, incorporating ten key features, demonstrated the best predictive performance. The model achieved a C-statistic of 0.776, with the area under the receiver operating characteristic curve (AUC) values of 0.86, 0.80, 0.841, and 0.826 for 5-, 10-, 15-, and 16.8-year all-cause mortality predictions, respectively. The SHAP approach was employed to interpret the model's individual decision-making processes. Conclusions. The developed model exhibited strong predictive performance for mortality risk assessment. Its clinically interpretable outputs enable potential bedside application, improving the identification of high-risk patients and supporting timely treatment optimization.
- Asia > Russia (0.14)
- Europe > Italy > Marche > Ancona Province > Ancona (0.05)
- Europe > Russia > Volga Federal District > Nizhny Novgorod Oblast > Nizhny Novgorod (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)